Senior Data Engineer
Dallas, TX
Long-term contract
Job Description:
We are looking for a Data Engineer to join our Advanced Analytics team under the Wholesale organization. This engineer will design and deliver data solutions for the analytics and data science teams in the consumer organization, working with a wide range of data: fiber network and outage data, billing, digital behavior, customer interactions, and customer demographic data.
In this role, you will design and build a modern data infrastructure that streamlines workflows for the data science team and the broader advanced analytics teams in the consumer organization. These teams are primarily responsible for building predictive models and driving retention and sales strategy for our consumer base.
Responsibilities:
• Develop and implement ETL pipelines to efficiently transfer data from SQL Server to Databricks, ensuring accuracy, scalability, and optimal performance.
• Validate and transform data within Databricks, including testing for data integrity, consistency, and quality across ingestion and processing stages.
• Collaborate with cross-functional teams to design, build, and optimize curated tables in Databricks for analytics and reporting purposes.
• Monitor and troubleshoot data workflows, addressing any issues related to performance, reliability, or data anomalies.
• Document data engineering processes and provide technical support for ongoing maintenance and enhancements to the Databricks environment.
Required Qualifications:
• 2+ years of experience as a Data Engineer or in a similar role
• Experience with data modeling, data warehousing, and building pipelines
• Proven experience in designing and implementing comprehensive data pipelines for a variety of flows (data integration across systems, ETL processes, machine learning infrastructure)
• Proficient in SQL and Python
• Experience building ETL pipelines in Databricks
• Additional experience with any of the following is a plus: the AWS ecosystem (e.g. Glue, Athena, S3, Lambda, IAM, SageMaker, CloudWatch, API Gateway), Delta Lake, PySpark, Apache Spark, Airflow, APIs (REST, SOAP, RPC), and streaming event data
Lokesh Kumar
Senior Consultant
Lokesh.Kumar@infovision.com